Abstract. Supersymmetry (SUSY) is an attractive extension of the Standard Model that could 
solve many long-standing issues in particle physics and cosmology. The general-purpose ATLAS 
detector at the Large Hadron Collider (LHC) is an experiment capable of discovering or 
excluding TeV-scale SUSY. However, a discovery can only be claimed when the Standard Model 
backgrounds are understood and under control. The expectation at the LHC is that 
Monte Carlo simulation predictions may not be sufficient to achieve this, and that the backgrounds 
will have to be determined from the data themselves. In this note we highlight some data-driven 
methods developed to estimate backgrounds and detect a possible SUSY excess. 



1. Introduction 

In order to prevent rapid proton decay, a new quantum number called R-parity is conserved in 
SUSY models, with two clear consequences for SUSY searches with ATLAS. First, 
the lightest supersymmetric particle is absolutely stable, giving a signature of missing energy in 
the detector. Second, we expect SUSY events to exhibit a relatively large multiplicity of high-$p_T$ 
jets, as SUSY particles with strong-interaction couplings have the highest production rates at the 
LHC and decay through long decay chains. Furthermore, it was decided to study SUSY 
with different search strategies based on the exclusive requirement of zero, one or more leptons. 

The aim of the described studies was to develop data driven techniques to estimate the major 
Standard Model (SM) backgrounds such as W/Z bosons with associated jets, top quark pair 
production and jet production from QCD processes. SUSY should show up as an excess over the 
predicted SM events in the so-called "signal" region, where new physics may be present. The 
prediction is done by extrapolating from a "control" region, which should be as close as possible 
to the signal region, give an unbiased estimate, have sufficient statistics and small theoretical 
uncertainties. The first requirement has the effect of contaminating the control region with 
SUSY events, which results in overestimating the SM background in the signal region, but this 
contamination can be taken into account. 

We assumed the LHC running at $\sqrt{s} = 14$ TeV/c$^2$ and a collected integrated luminosity 
of 1 fb$^{-1}$. Due to space limitations the methods described in this note are a subset of all the 
methods developed and the reader is referred to [1] for a complete description. 

2. One-lepton search mode 

The one-lepton search mode requires an isolated lepton. This strongly suppresses the QCD 
background, which dominates over other processes by orders of magnitude and has high theoretical 
and instrumental uncertainties. A lepton in our studies is either an electron or a muon with 
$p_T$ of at least 20 GeV/c. To avoid overlap with the di-lepton mode, we veto events with a 
second identified lepton with a $p_T$ of more than 10 GeV/c. Assuming R-parity conservation we 
demand at least four jets with $|\eta| < 2.5$ and $p_T > 50$ GeV/c, at least one of which must have 
$p_T > 100$ GeV/c. The missing transverse energy $E_T^{miss}$ should be larger than 100 GeV/c and 
above $0.2\,M_{eff}$, where $M_{eff}$ is the effective mass. Our last selection criterion is that the transverse 
sphericity $S_T$ is larger than 0.2. All definitions of variables can be found in [1]. 
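
The selection above can be summarised as a single predicate. The following is a minimal sketch, not the analysis code: the `Event` record and its field names are hypothetical, and $M_{eff}$ and $S_T$ are assumed to be precomputed according to the definitions in [1]. Thresholds are in the units quoted in the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Event:
    """Hypothetical flat event record (illustrative field names)."""
    lepton_pts: List[float]          # pT of identified electrons and muons
    jets: List[Tuple[float, float]]  # (pT, eta) for each jet
    met: float                       # missing transverse energy
    m_eff: float                     # effective mass, computed as in [1]
    s_t: float                       # transverse sphericity, computed as in [1]


def passes_one_lepton_selection(ev: Event) -> bool:
    """Apply the one-lepton cuts quoted in the text."""
    leptons = sorted(ev.lepton_pts, reverse=True)
    # The leading lepton must reach 20 GeV/c.
    if not leptons or leptons[0] < 20.0:
        return False
    # Veto a second identified lepton above 10 GeV/c (di-lepton overlap).
    if len(leptons) > 1 and leptons[1] > 10.0:
        return False
    # At least four central jets above 50 GeV/c, one of them above 100 GeV/c.
    good_jets = [pt for pt, eta in ev.jets if abs(eta) < 2.5 and pt > 50.0]
    if len(good_jets) < 4 or max(good_jets) <= 100.0:
        return False
    # Missing transverse energy: absolute cut and cut relative to M_eff.
    if ev.met <= 100.0 or ev.met <= 0.2 * ev.m_eff:
        return False
    # Transverse sphericity cut.
    return ev.s_t > 0.2
```

Keeping the cuts in one function makes it straightforward to reuse the jet and $E_T^{miss}$ requirements for the other search modes by changing only the lepton logic.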

2.1. Combined fit method 

After the selection the only non-negligible backgrounds left are W bosons with associated 
jets (W+jets) and top quark pairs. We divide the latter into two categories: semileptonic 
($t\bar{t} \to b\bar{b}\ell\nu q\bar{q}$) and dileptonic ($t\bar{t} \to b\bar{b}\ell\nu\ell\nu$) top quark pairs, as these have different shapes 
and yields. Dileptonic top quark pair events end up in our one-lepton sample either because 
the second lepton was misreconstructed or not reconstructed, or because it was a tau lepton decaying 
hadronically; these cases constitute one third and two thirds of these dileptonic events, respectively. 

We fit the SM backgrounds in three observables: $E_T^{miss}$, $M_T$ and $m_{top}$. $M_T$ is the invariant 
mass of the $E_T^{miss}$ vector and the lepton $p_T$, while $m_{top}$ is the invariant mass of the three jets with 
the largest vector-summed $p_T$. 
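
As an illustration of how $M_T$ and $m_{top}$ might be computed, the sketch below assumes the usual massless-lepton transverse mass, $M_T = \sqrt{2\,p_T\,E_T^{miss}\,(1-\cos\Delta\phi)}$, and treats jets as massless; the exact definitions used in the analysis are those of [1], and the function names are hypothetical.

```python
import math
from itertools import combinations


def transverse_mass(lep_pt, lep_phi, met, met_phi):
    """M_T of a massless lepton and the missing transverse momentum."""
    dphi = lep_phi - met_phi
    return math.sqrt(max(0.0, 2.0 * lep_pt * met * (1.0 - math.cos(dphi))))


def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) from collider coordinates."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def top_candidate_mass(jets):
    """Invariant mass of the three-jet combination with the largest
    vector-summed pT; `jets` is a list of (pt, eta, phi) tuples.
    Returns None when fewer than three jets are given."""
    best_pt2, best_mass = -1.0, None
    for trio in combinations(jets, 3):
        # Sum the four-vectors of the three jets component by component.
        e, px, py, pz = (sum(c) for c in zip(*(four_vector(*j) for j in trio)))
        pt2 = px * px + py * py
        if pt2 > best_pt2:
            best_pt2 = pt2
            best_mass = math.sqrt(max(0.0, e * e - px * px - py * py - pz * pz))
    return best_mass
```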

Taking physics features into account we construct probability density functions (p.d.f.'s) that 
model each of the three main backgrounds in the three observables. For example, the semileptonic 
$t\bar{t}$ shows a clear peak in $m_{top}$ if the three jets from the top quark decay are correctly identified. 

For a broad range of SUSY parameters it was observed that the shapes of the low-$E_T^{miss}$ and 
low-$M_T$ distributions depend little on the chosen model point. Thus we can construct a 
model-independent Ansatz shape to describe the SUSY contamination at low energy. 

The combined model used in the fit is the sum of the p.d.f.'s of each background sample and 
the SUSY Ansatz shape, with a yield parameter for each separate component. The first step in 
the procedure is to fit the $E_T^{miss}$, $M_T$ and $m_{top}$ distributions with shapes obtained from Monte 
Carlo simulation. To make the method more data-driven, as many of the shape 
parameters as possible are then released in the fit. Figure 1 shows the result of the fit with floating yields and floating 
shape parameters to 1 fb$^{-1}$ of data. 
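
The idea of a sum of component shapes, each with its own yield, can be illustrated with a much simpler toy than the fit described here. The sketch below is an assumption-laden stand-in: it uses a single binned observable, keeps all shapes fixed, and maximises the extended Poisson likelihood with expectation-maximisation updates rather than a general minimiser. The template names and numbers are invented for illustration.

```python
def fit_yields(data, templates, n_iter=2000):
    """Extended binned maximum-likelihood fit of component yields.

    data      -- observed counts per bin
    templates -- dict name -> per-bin probabilities (each summing to 1)
    Shapes are fixed in this toy; only the yields float.
    """
    total = float(sum(data))
    # Start from equal yields for all components.
    yields = {name: total / len(templates) for name in templates}
    for _ in range(n_iter):
        # Expected count per bin for the current yields.
        expected = [sum(yields[k] * templates[k][i] for k in templates)
                    for i in range(len(data))]
        # EM update: share each bin's observed count among the components
        # in proportion to their current expected contributions.
        yields = {
            k: sum(n * yields[k] * templates[k][i] / mu
                   for i, (n, mu) in enumerate(zip(data, expected)) if mu > 0)
            for k in templates
        }
    return yields
```

For example, with `templates = {"sm": [0.4, 0.3, 0.2, 0.1], "susy": [0.1, 0.2, 0.3, 0.4]}` and `data = [130, 110, 90, 70]`, the data are an exact mixture of 300 and 100 events, so the fit should return yields very close to those values.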

The final step is to extrapolate the yields of the SM background components to the 
signal region. Table 1 shows the extrapolated yields to the signal region (SIG), defined 
by $E_T^{miss} > 200$ GeV/c and $M_T > 150$ GeV/c$^2$, while propagating all correlated parameter 
uncertainties with fixed and floating shapes. For comparison the same table shows extrapolated 
yields to the full parameter space (FULL) from the control region. The yields that we find are 
in good agreement with the truth values of the fitted event mix within statistical uncertainties. 

Figure 1. Distribution of $E_T^{miss}$ (left), $M_T$ (center) and $m_{top}$ (right) of a 1 fb$^{-1}$ mix of $t\bar{t}$ 
and W+jets SM events and SUSY SU3 events, overlaid with projections of the combined model 
fitted to the mix of events with floating yield and shape parameters. For each projection the 
semileptonic $t\bar{t}$ contribution (dark blue), dileptonic $t\bar{t}$ contribution (light 
blue), W+jets contribution (red) and Ansatz SUSY contribution (black) are shown. 

Table 1. Yields from the combined fit with either fixed or floating shapes extrapolated to the 
full parameter space, the truth yields in full parameter space, the extrapolated yields into the 
signal region and the truth yields in the signal region. 

Component             Extrap. yield in FULL           True    Extrap. yield in SIG            True
                      Shape fixed   Shape floating    FULL    Shape fixed   Shape floating    SIG
W + jets              205 ± 45      227 ± 68          173     0.5 ± 0.4     -1.2 ± 2.7        2
1-lepton $t\bar{t}$   476 ± 35      485 ± 59          502     0.4 ± 0.2     -1.1 ± 3.9
2-lepton $t\bar{t}$   62 ± 38       17 ± 54           70      4.5 ± 2.9     4.7 ± 7.9         5
SUSY SU3              273 ± 33      287 ± 38          271     92.7 ± 2.8    95.6 ± 4.0        91

2.2. HT2 method 

If we add an extra cut of $M_T > 100$ GeV/c$^2$ to our event selection, the only significant background 
we are left with is dileptonic $t\bar{t}$. The HT2 method estimates this background by using two nearly 
independent variables, HT2 and $E_T^{miss}$ significance, defined as: 



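$$\mathrm{HT2} = \sum_{i=2}^{4} p_T^{\mathrm{jet}\,i} + p_T^{\mathrm{lepton}}, \qquad E_T^{miss}\ \mathrm{significance} = \frac{E_T^{miss}}{0.49 \cdot \sqrt{\sum E_T}} \qquad (1)$$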

The leading jet $p_T$ was excluded from HT2 in order to reduce the correlation with $E_T^{miss}$. The 
correlation between the highest-$p_T$ jet and $E_T^{miss}$ is a consequence of kinematics, as the rest of 
the event recoils from this jet. As the $E_T^{miss}$ resolution depends on $\sum E_T$, which is clearly related 
to HT2, the $E_T^{miss}$ significance was used instead of $E_T^{miss}$ to further remove any correlation. 
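
The two variables of equation (1) are simple to compute per event. The sketch below is illustrative only: it assumes that the jet sum runs over the second to fourth highest-$p_T$ jets (exposed as the hypothetical parameter `n_jets`) and that $\sum E_T$ is supplied by the caller.

```python
import math


def ht2(jet_pts, lepton_pt, n_jets=4):
    """Scalar sum of the 2nd..n_jets-th highest jet pT plus the lepton pT.

    The leading jet is skipped to reduce the correlation with E_T^miss.
    """
    ordered = sorted(jet_pts, reverse=True)
    return sum(ordered[1:n_jets]) + lepton_pt


def met_significance(met, sum_et):
    """E_T^miss divided by its approximate resolution 0.49 * sqrt(sum E_T)."""
    return met / (0.49 * math.sqrt(sum_et))
```

For instance, jets with $p_T$ of 120, 80, 60, 55 and 30 GeV/c and a 25 GeV/c lepton give HT2 = 80 + 60 + 55 + 25 = 220 GeV/c under these assumptions.
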
Figure 2. Left: the histogram is the observed $E_T^{miss}$ significance distribution, open circles are the true 
SUSY signal, blue triangles are the true SM background and black filled circles are the estimated 
background. Right: open circles are the true SUSY signal as a function of $E_T^{miss}$ significance; the 
estimated SUSY excess (black) is obtained by subtracting the estimated background from the observed 
$E_T^{miss}$ significance distribution. 



A control sample is defined by HT2 < 300 GeV/c, from which the shape of the $E_T^{miss}$ significance 
is taken. The assumption is that this shape is independent of HT2. The normalization of this 
shape is obtained by comparing the event counts of the control sample to the signal sample at 
$8 < E_T^{miss}$ significance $< 14$. This low $E_T^{miss}$ significance region is equivalent to a low $E_T^{miss}$ region 
almost unpopulated by SUSY. 
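
The normalization step can be sketched as follows, assuming both samples are histogrammed in $E_T^{miss}$ significance with common bin edges. The function name, the histogram representation and the rule that only bins fully inside the window are counted are choices made for this illustration.

```python
def estimate_background(control_hist, signal_hist, bin_edges,
                        norm_window=(8.0, 14.0)):
    """Scale the control-sample shape to the signal sample.

    control_hist, signal_hist -- event counts per significance bin
    bin_edges                 -- len(hist) + 1 increasing bin edges
    Returns the estimated background counts per bin in the signal sample.
    """
    lo, hi = norm_window
    # Bins that lie fully inside the normalisation window.
    in_window = [i for i in range(len(control_hist))
                 if bin_edges[i] >= lo and bin_edges[i + 1] <= hi]
    n_control = sum(control_hist[i] for i in in_window)
    n_signal = sum(signal_hist[i] for i in in_window)
    if n_control == 0:
        raise ValueError("no control events in the normalisation window")
    scale = n_signal / n_control
    return [scale * c for c in control_hist]
```

Subtracting the returned list bin by bin from `signal_hist` gives the estimated excess at high significance.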

The background distribution and the excess of SUSY signal at high $E_T^{miss}$ significance are shown 
in Figure 2 (left). The background is somewhat overestimated due to SUSY contamination in 
the control sample. However, the right histogram of Figure 2 shows that with a higher cut 
on $E_T^{miss}$ significance the SUSY signal would still be clearly seen over the estimated background. 

3. No-lepton search mode 

As the name suggests, we veto all events with an identified muon or electron with $p_T > 20$ GeV/c. 
Otherwise the no-lepton mode event selection is equivalent to the one-lepton mode concerning jets, 
$E_T^{miss}$, $S_T$ and $M_{eff}$. To pass the $E_T^{miss} > 100$ GeV/c requirement, QCD events must contain a 
poorly reconstructed jet, which can be caused by dead material, jet punch-through, pile-up of 
machine backgrounds and other effects. $E_T^{miss}$ will then point towards or away from this poorly reconstructed 
jet. To reject these QCD events, the difference in azimuthal angle between the $E_T^{miss}$ 
vector and each of the three highest-$p_T$ jets is required to be larger than 0.2. 
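
A minimal sketch of this azimuthal requirement is given below. The function names and the `(pt, phi)` jet representation are hypothetical; angles are in radians and the separation is folded into $[0, \pi]$.

```python
import math


def delta_phi(phi1, phi2):
    """Absolute azimuthal separation folded into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


def passes_delta_phi_cut(met_phi, jets, min_dphi=0.2, n_jets=3):
    """Require the E_T^miss direction to be separated by more than
    min_dphi from each of the n_jets highest-pT jets.

    jets -- list of (pt, phi) tuples in any order
    """
    leading = sorted(jets, key=lambda j: j[0], reverse=True)[:n_jets]
    return all(delta_phi(met_phi, phi) > min_dphi for _, phi in leading)
```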



3.1. Replace method 

The $Z \to \nu\bar{\nu}$ with associated jets process is one of the main backgrounds in the no-lepton 
search. To estimate its shapes and expected number of events, $Z \to \ell^+\ell^-$ events are selected by 
requiring two oppositely charged leptons, an invariant mass of the two leptons within 10 GeV/c$^2$ 
of the Z mass and $E_T^{miss} < 30$ GeV/c. Then the charged leptons are replaced by neutrinos, 
the variables are recalculated and the events go through the no-lepton selection procedure again. 
Four corrections must be applied to get a correct estimate, as summarized by this formula: 



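$$N_{Z\to\nu\bar{\nu}}(E_T^{miss}) = \frac{N_{Z\to\ell^+\ell^-}(p_T(\ell^+\ell^-))}{\mathrm{eff}(\eta_{\ell^+}, p_{T,\ell^+}) \cdot \mathrm{eff}(\eta_{\ell^-}, p_{T,\ell^-})} \times c_{Kin}(p_T(Z)) \times c_{Fidu}(p_T(Z)) \times \frac{\mathrm{Br}(Z\to\nu\bar{\nu})}{\mathrm{Br}(Z\to\ell^+\ell^-)} \qquad (2)$$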

where $N_{Z\to\nu\bar{\nu}}$ is the corrected number of events per bin of $E_T^{miss}$, $N_{Z\to\ell^+\ell^-}$ is the raw number 
of control sample events as a function of $p_T(Z)$, $c_{Kin}$ is the kinematic correction due to extra 
selection criteria, $c_{Fidu}$ is the fiducial correction since we cannot detect leptons outside $|\eta| < 2.5$, 
$\mathrm{eff}(\eta_{\ell^+}, p_{T,\ell^+})$ and $\mathrm{eff}(\eta_{\ell^-}, p_{T,\ell^-})$ are the corrections for lepton reconstruction efficiency as a function 
of $\eta$ and $p_T$, and finally we correct for the difference in branching ratios of the Z boson. 
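
The replacement step and the corrections of equation (2) can be sketched per event as below. This is an illustration under stated assumptions rather than the analysis implementation: the function names are hypothetical, the correction factors and efficiencies are taken as inputs, and the default branching fractions are approximate values inserted only to make the example concrete.

```python
import math

# Approximate Z branching fractions, used only as illustrative defaults.
BR_Z_NUNU = 0.200   # Z -> neutrinos, summed over flavours (assumed value)
BR_Z_LL = 0.0337    # Z -> l+ l- for a single lepton flavour (assumed value)


def replaced_met(met_x, met_y, lep1, lep2):
    """Treat both leptons as neutrinos: add their transverse momenta
    vectorially to the measured missing transverse momentum.

    lep1, lep2 -- (pt, phi) tuples; returns the new E_T^miss magnitude.
    """
    for pt, phi in (lep1, lep2):
        met_x += pt * math.cos(phi)
        met_y += pt * math.sin(phi)
    return math.hypot(met_x, met_y)


def event_weight(c_kin, c_fidu, eff_plus, eff_minus,
                 br_nunu=BR_Z_NUNU, br_ll=BR_Z_LL):
    """Per-event weight combining the four corrections: kinematic,
    fiducial, lepton efficiency and branching-ratio."""
    return c_kin * c_fidu / (eff_plus * eff_minus) * (br_nunu / br_ll)
```

Filling a histogram of the recomputed $E_T^{miss}$ with these weights gives the corrected estimate per bin under the assumptions above.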

This method does not suffer from SUSY contamination as the stringent control sample 
selection cuts make it practically free of SUSY. The described procedure estimates the $Z \to \nu\bar{\nu}$ 
background correctly but its precision is limited by statistics in the control sample. 



4. Conclusion 

We have presented a number of methods to estimate the top quark pair, 
W+jets, Z+jets and QCD backgrounds from data in regions where we expect to find a SUSY 
excess. For an integrated luminosity of 1 fb$^{-1}$ many complementary methods have been 
developed and are shown to reliably assess the SM backgrounds as well as the possible SUSY 
excess. Ongoing work focussed on the expected early LHC running scenario, with an integrated 
luminosity of 200 pb$^{-1}$ at collision energies close to 10 TeV/c$^2$, shows promising results. 